Compiler Optimization

ABSTRACT

Provides effective use of architecture-specific instructions. There is provided a compiler including: a target partial program detecting unit for detecting, from among partial programs of the program to be optimized, a partial program including instructions corresponding to all instructions included in the pattern to be replaced, as a partial program to be optimized; an instruction sequence transforming unit for transforming, in the partial program to be optimized, instructions other than those instructions corresponding to instructions included in the pattern to be replaced and those instructions having execution dependencies different from the pattern to be replaced, so that dependencies between instructions included in the partial program to be optimized match the pattern to be replaced; and an instruction sequence replacing unit for replacing the partial program to be optimized transformed by the instruction sequence transforming unit with a target instruction sequence determined in accordance with the pattern to be replaced.

FIELD OF THE INVENTION

The present invention relates to a compiler, an optimization method, a compiler program, and a recording medium. In particular, the present invention relates to a compiler, an optimization method, a compiler program, and a recording medium that replace an instruction arrangement pattern that is known to be optimizable with a target instruction sequence corresponding to the arrangement pattern.

BACKGROUND

There has been proposed a technique of detecting an instruction sequence matching a predetermined pattern from a program to be optimized and replacing the instruction sequence with another instruction sequence determined in advance in accordance with the pattern. This technique can optimize a program, for example, by replacing a sequence of instructions for performing a certain kind of processing with a single instruction producing the same processing result as the processing performed by the sequence of instructions. The instruction which replaces the sequence of instructions is, for example, a TRT instruction in the S/390 architecture provided by IBM Corporation.

The following documents are referred to and/or considered with respect to an embodiment:

-   [Non-Patent Document 1] Jianghai Fu. Directed graph pattern matching and topological embedding. Journal of Algorithms, 22(2):372-391, February 1997.
-   [Non-Patent Document 2] S. S. Muchnick. Advanced compiler design and implementation. Morgan Kaufmann Publishers, Inc., 1997.
-   [Non-Patent Document 3] Arvind Gupta and Naomi Nishimura. Finding Largest Subtrees and Smallest Supertrees. Algorithmica, Vol. 21, No. 2, pp. 183-210, 1998.
-   [Non-Patent Document 4] http://publibz.boulder.ibm.com/epubs/pdf/dz9zr002.pdf, pp. 7-180

A TRT instruction is an instruction to scan a predetermined storage area in order from the top and output an address or the like at which a value satisfying a predetermined condition is stored (see Non-Patent Document 4). FIG. 16 is a control flow graph corresponding to processing according to a TRT instruction. The processing by means of the TRT instruction corresponds to a sequence of processing steps by which values stored in a storage area byte array are read out in order from the top of the storage area to a variable ch, and which ends when one of conditions cond1 to condN is satisfied. A compiler may replace such a processing sequence with a single TRT instruction to optimize a program.
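The kind of loop that such an instruction replaces can be sketched as follows. This is a minimal illustration only: the function name scan_until_condition and the 256-entry lookup table that stands in for conditions cond1 to condN are assumptions made for the sketch, not details taken from the figures.

```python
def scan_until_condition(data: bytes, table: bytes):
    """Scan data from the top; stop at the first byte whose table entry is nonzero.

    Returns (index, function_byte), or (None, 0) if no byte satisfies a condition.
    The table is a hypothetical stand-in for conditions cond1..condN.
    """
    for index, ch in enumerate(data):
        function_byte = table[ch]
        if function_byte != 0:
            return index, function_byte
    return None, 0


# Hypothetical conditions: stop on ',' (code 1) or ';' (code 2).
table = bytearray(256)
table[ord(",")] = 1
table[ord(";")] = 2
result = scan_until_condition(b"abc;def,", bytes(table))
no_hit = scan_until_condition(b"abc", bytes(table))
```

A compiler that recognizes the whole loop can emit one scanning instruction in its place instead of one load, one compare and one branch per byte.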

DISCLOSURE OF THE INVENTION AND PROBLEMS SOLVED BY THE INVENTION

However, it is rare that a program to be optimized completely matches a predetermined pattern. If such a match does not occur, optimization is abandoned in the conventional art. Therefore there has been a possibility of failure to effectively utilize an instruction such as a TRT instruction specific to an architecture.

It is, therefore, an object of the present invention to provide a compiler, an optimization method, a compiler program, and a recording medium as a solution to the above-described problem. This object can be attained by a combination of features described in the independent claims in the appended claims. In the dependent claims, further advantageous examples of the present invention are specified.

SUMMARY OF THE INVENTION

To solve the above-described problem, according to a first aspect of the present invention, there is provided a compiler that detects, in a program to be optimized, a pattern to be replaced that includes multiple predetermined instructions, and replaces the detected pattern to be replaced with a target instruction sequence determined in accordance with the pattern to be replaced.

The compiler has: a target partial program detecting unit for detecting, from among partial programs of the program to be optimized, a partial program including instructions corresponding to all instructions included in the pattern to be replaced, as a partial program to be optimized; an instruction sequence transforming unit for transforming, in the partial program to be optimized, instructions other than those instructions corresponding to instructions included in the pattern to be replaced and those instructions having execution dependencies different from the pattern to be replaced so that dependencies between instructions included in the partial program to be optimized match the pattern to be replaced; and an instruction sequence replacing unit for replacing the partial program to be optimized transformed by the instruction sequence transforming unit with the target instruction sequence determined in accordance with the pattern to be replaced. Thus, the present invention allows architecture-specific instructions to be used effectively.

BRIEF DESCRIPTION OF THE DRAWINGS

The foregoing and other aspects of these teachings are made more evident in the following detailed description of the invention, when read in conjunction with the attached drawing figures, wherein:

FIG. 1 is a functional block diagram of a compiler 10;

FIG. 2 shows a concrete example of a pattern 20 to be replaced and a partial program 40 to be optimized;

FIG. 3 shows a concrete example of the partial program 40 to be optimized and a partial program 50 to be optimized corresponding to FIG. 2(e);

FIG. 4 is a flowchart showing a process in which the compiler 10 optimizes the program to be optimized;

FIG. 5 shows a first example of the pattern 20 to be replaced and a target instruction template 30;

FIG. 6 shows a first example of the partial program 40 to be optimized by the compiler 10;

FIG. 7 shows a resultant partial program 60 in the first example;

FIG. 8 shows a second example of the partial program 40 to be optimized by the compiler 10;

FIG. 9 shows the resultant partial program 60 in the second example;

FIG. 10 shows a third example of the partial program 40 to be optimized by the compiler 10;

FIG. 11 shows the resultant partial program 60 in the third example;

FIG. 12 shows a fourth example of the partial program 40 to be optimized by the compiler 10;

FIG. 13 shows the resultant partial program 60 in the fourth example;

FIG. 14 is a diagram for explaining the effect of the embodiment;

FIG. 15 shows an example of the hardware configuration of a computer 500 which functions as the compiler 10; and

FIG. 16 is a control flow graph corresponding to processing according to a TRT instruction.

DESCRIPTION OF SYMBOLS

-   10 . . . Compiler
-   20 . . . Pattern to be replaced
-   30 . . . Target instruction template
-   40 . . . Partial program to be optimized
-   50 . . . Partial program to be optimized
-   60 . . . Resultant partial program
-   100 . . . Optimization candidate detecting unit
-   110 . . . Target partial program detecting unit
-   120 . . . Instruction sequence transforming unit
-   130 . . . Instruction sequence replacing unit

DETAILED DESCRIPTION OF THE INVENTION

The present invention provides a compiler that detects, in a program to be optimized, a pattern to be replaced that includes multiple predetermined instructions, and that replaces the detected pattern to be replaced with a target instruction sequence determined in accordance with the pattern to be replaced. The compiler has: a target partial program detecting unit for detecting, from among partial programs of the program to be optimized, a partial program including instructions corresponding to all instructions included in the pattern to be replaced, as a partial program to be optimized; an instruction sequence transforming unit for transforming, in the partial program to be optimized, instructions other than those instructions corresponding to instructions included in the pattern to be replaced and those instructions having execution dependencies different from the pattern to be replaced so that dependencies between instructions included in the partial program to be optimized match the pattern to be replaced; and an instruction sequence replacing unit for replacing the partial program to be optimized transformed by the instruction sequence transforming unit with the target instruction sequence determined in accordance with the pattern to be replaced.

It is noted that not all the necessary features of the invention are listed. Subcombinations of the features can also constitute the present invention. The present invention allows architecture-specific instructions to be used effectively.

ADVANTAGEOUS EMBODIMENTS

The present invention will be described with respect to embodiments thereof. The embodiment described below, however, does not limit the invention set forth in the appended claims, and not all combinations of features described in the description of the embodiment are necessarily indispensable to the solution according to the present invention.

FIG. 1 is a functional block diagram of a compiler 10. The compiler 10 detects a pattern which is to be replaced and which has multiple predetermined instructions, and replaces the detected pattern to be replaced with a target instruction sequence determined in accordance with the pattern to be replaced. The target instruction sequence is an instruction sequence that is executed more efficiently than the pattern to be replaced and includes, for example, an architecture-specific high-speed instruction. That is, the purpose is to optimize the program to be optimized into an instruction sequence which is executed more efficiently.

The compiler 10 has an optimization candidate detecting unit 100, a target partial program detecting unit 110, an instruction sequence transforming unit 120, and an instruction sequence replacing unit 130. The optimization candidate detecting unit 100 detects a candidate for a partial program which is an object to be optimized. For example, the optimization candidate detecting unit 100 detects, in a program to be optimized, a partial program including a memory access instruction to access the same type of data as data to be accessed according to a memory access instruction included in the pattern to be replaced. The partial program may be a processing unit of the program called a method, a function or a procedure, or may be an instruction sequence such as loop processing determined on the basis of a characteristic of a control flow.

The target partial program detecting unit 110 detects, as a partial program 40 to be optimized, a partial program similar to a pattern 20 to be replaced from among the multiple partial programs detected by the optimization candidate detecting unit 100. For example, the target partial program detecting unit 110 detects as the partial program 40 a partial program including instructions corresponding to all instructions included in the pattern 20. More specifically, the target partial program detecting unit 110 determines, with respect to two instructions, that the instructions correspond to each other if the processing details according to the instructions are identical to each other, if the numbers of control flows output from the instructions are equal to each other, and if the instructions at the transition destinations of the control flows are identical to each other.
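The three-part correspondence test can be sketched as below. The Instr record and its fields are hypothetical names introduced for illustration; an actual compiler would compare nodes of its own intermediate representation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    op: str                 # processing detail, e.g. "load" or "compare"
    successors: tuple = ()  # ops of the instructions at the control-flow destinations


def corresponds(a: Instr, b: Instr) -> bool:
    """Two instructions correspond if all three conditions hold."""
    same_processing = a.op == b.op
    same_flow_count = len(a.successors) == len(b.successors)
    same_destinations = sorted(a.successors) == sorted(b.successors)
    return same_processing and same_flow_count and same_destinations


match = corresponds(Instr("compare", ("load", "exit")),
                    Instr("compare", ("exit", "load")))
mismatch = corresponds(Instr("compare", ("load",)),
                       Instr("compare", ("load", "exit")))
```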

The instruction sequence transforming unit 120 transforms, in the partial program 40, instructions other than those instructions corresponding to instructions included in the pattern 20 and those instructions having execution dependencies different from the pattern to be replaced so that dependencies between instructions included in the partial program 40 match the pattern 20. The instruction sequence transforming unit 120 may transform other instructions if necessary. The transformed partial program to be optimized is set as a partial program 50 to be optimized.

The instruction sequence replacing unit 130 replaces the partial program 50 transformed by the instruction sequence transforming unit 120 with a target instruction sequence determined in accordance with the pattern 20. For example, the compiler 10 generates a target instruction sequence by replacing each variable in a target instruction template 30 showing the structure of the target instruction sequence with a corresponding variable in the partial program 50. As a result, the compiler 10 outputs as a resultant partial program 60 the program to be optimized including the target instruction sequence.
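Filling a template with the variables of the matched partial program can be illustrated with a plain text substitution. The template text, the placeholder names and the trt_scan operation below are all hypothetical; the description only states that each template variable is replaced with the corresponding program variable.

```python
from string import Template

# Hypothetical target instruction template with placeholder variables.
template = Template("$pos = trt_scan($array, $start, $table)")

# Hypothetical binding from template variables to variables found in
# the matched partial program.
binding = {"pos": "k", "array": "buf", "start": "i", "table": "stop_chars"}

target_sequence = template.substitute(binding)
```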

FIG. 2(a) shows a concrete example of the pattern 20. The pattern 20 has an instruction A, an instruction B, an instruction C and an instruction D. The pattern 20 determines execution dependencies between the instruction A, instruction B, instruction C and instruction D. The execution dependencies between the instructions are, for example, control flows between the instructions. According to the control flows, the instruction B is executed after execution of the instruction A, the instruction C is executed after execution of the instruction B, and the instruction D is executed after execution of the instruction C. The instruction A is again executed after execution of the instruction D.

The pattern 20 may alternatively determine control dependences or data dependences between instructions. Also, the pattern 20 may be a PDG (program dependence graph), which is a dependence graph determining both control dependences and data dependences. That is, the pattern 20 may be a dependence graph having as a node each of multiple instructions included in the pattern 20 and having directed edges representing execution dependences between multiple instructions.

FIG. 2(b) shows a first example of the partial program 40. This partial program includes instruction A, instruction B, instruction C, instruction D and instruction A. Accordingly, this partial program includes instructions corresponding to all the instructions included in the pattern 20. The target partial program detecting unit 110 therefore detects this partial program as the partial program 40. Thus, in a case where the pattern 20 determines recurring dependences between instructions, the target partial program detecting unit 110 can detect, as the partial program 40, an instruction sequence having dependencies which are the same as those determined by the pattern 20 to be replaced but differ in recurrence phase.

In this case, the instruction sequence replacing unit 130 replaces the partial program 40 with the target instruction sequence, without the partial program 40 being transformed by the instruction sequence transforming unit 120. In addition to this example, the target partial program detecting unit 110 may detect, as the partial program 40 to be optimized, an instruction sequence having dependences which are the same as those determined by the pattern 20 and which appear in the same recurrence phase (a completely matching sequence). Also in this case, the instruction sequence replacing unit 130 replaces the partial program 40 with the target instruction sequence, without the partial program 40 being transformed by the instruction sequence transforming unit 120.

FIG. 2(c) shows a second example of the partial program 40. This partial program includes instruction A, instruction B, instruction C and instruction D. Accordingly, this partial program includes instructions corresponding to all the instructions included in the pattern 20. The target partial program detecting unit 110 therefore detects this partial program as the partial program 40. Thus, the target partial program detecting unit 110 can detect, as the partial program 40, a partial program including an instruction sequence executed in a certain order different from that of the execution dependences in the pattern 20.

In this case, the instruction sequence transforming unit 120 changes the order of execution of the instructions in the partial program 40 on the basis of the dependences, on condition that the results of processing by the partial program 40 are not changed after changing the order of execution of the instructions in the partial program 40. More specifically, the instruction sequence transforming unit 120 interchanges the positions of the instruction B and the instruction C in the execution order if the instruction B does not depend on the result of processing according to the instruction C. The partial program 50 is thereby produced. The instruction sequence replacing unit 130 replaces the partial program 50, changed in instruction execution order, with the target instruction sequence.
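A minimal sketch of such a reordering follows, assuming each instruction is summarized by the sets of variables it reads and writes. The Instr record, the variable names and the neighbour-swapping strategy are illustrative assumptions, not the procedure of the embodiment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    name: str
    reads: frozenset
    writes: frozenset


def independent(first: Instr, second: Instr) -> bool:
    """True if swapping two adjacent instructions cannot change the results."""
    return (not (first.writes & second.reads)        # no read-after-write
            and not (first.reads & second.writes)    # no write-after-read
            and not (first.writes & second.writes))  # no write-after-write


def reorder_to_pattern(program, pattern_order):
    """Swap independent neighbours until the pattern order is reached, else None."""
    rank = {name: i for i, name in enumerate(pattern_order)}
    prog = list(program)
    changed = True
    while changed:
        changed = False
        for i in range(len(prog) - 1):
            a, b = prog[i], prog[i + 1]
            if rank[a.name] > rank[b.name] and independent(a, b):
                prog[i], prog[i + 1] = b, a
                changed = True
    return prog if [p.name for p in prog] == list(pattern_order) else None


A = Instr("A", frozenset({"i"}), frozenset({"ch"}))
B = Instr("B", frozenset({"i"}), frozenset({"u"}))
C = Instr("C", frozenset({"ch"}), frozenset({"t"}))
D = Instr("D", frozenset({"t", "u"}), frozenset({"i"}))
reordered = reorder_to_pattern([A, C, B, D], ["A", "B", "C", "D"])

# If B read the value written by C, the swap would be refused.
B_dep = Instr("B", frozenset({"t"}), frozenset({"u"}))
refused = reorder_to_pattern([A, C, B_dep, D], ["A", "B", "C", "D"])
```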

FIG. 2(d) shows a third example of the partial program 40. This partial program includes instruction A, instruction B, instruction C, instruction D and instruction E. Accordingly, this partial program includes instructions corresponding to all the instructions included in the pattern 20. The target partial program detecting unit 110 therefore detects this partial program as the partial program 40. Thus, the target partial program detecting unit 110 can detect, as the partial program 40, a partial program including in loop processing an additional instruction E which does not correspond to any of the instructions included in the pattern 20.

In this case, the instruction sequence transforming unit 120 makes a transformation such that the additional instruction is executed outside the loop processing, on condition that the result of execution of the additional instruction included in the loop processing of the partial program 40 is constant regardless of repetition of the loop processing. Alternatively, the instruction sequence transforming unit 120 may divide the loop processing of the partial program 40 into two loops, one executing the additional instruction and the other executing the instruction sequence other than the additional instruction. Division of loop processing will not be described since it is well known from Non-Patent Document 2. The instruction sequence replacing unit 130 replaces the loop processing from which the additional instruction has been removed with the target instruction sequence.

FIG. 2(e) shows a fourth example of the partial program 40. This partial program includes instruction A, instruction B and instruction D. This partial program lacks some of the instructions included in the pattern 20 (i.e., instruction C). In this case, the target partial program detecting unit 110 first computes the proportion of the instructions included in the pattern 20 that have corresponding instructions in the partial program, relative to all the instructions included in the pattern 20. The target partial program detecting unit 110 then detects the partial program as the partial program 40, on condition that the computed proportion is higher than a predetermined reference proportion. Processing in this case will be described with reference to FIG. 3.

FIG. 3 shows a concrete example of the partial program 40 and the partial program 50 corresponding to FIG. 2(e). The instruction sequence transforming unit 120 adds to the partial program 40 the instruction C, which is the absent instruction, i.e., the instruction included in the pattern 20 for which the partial program 40 has no corresponding instruction. The instruction sequence transforming unit 120 generates a cancel instruction to return the result of processing of the partial program 40 changed by the addition of the absent instruction to the processing result obtained in the case where the absent instruction is not added. Instruction C⁻¹ represents this cancel instruction. The instruction sequence transforming unit 120 may generate a set of two instructions respectively executed before and after the instruction C to cancel out the effect of the addition. Instruction C⁻¹_before and instruction C⁻¹_after represent these instructions.

For example, the instruction sequence transforming unit 120 generates as instruction C⁻¹_before a save instruction to save, before the instruction C, the value in a storage area in which the result of processing according to the instruction C is stored. The instruction sequence transforming unit 120 also generates as instruction C⁻¹_after a recovery instruction to recover the value in the storage area after execution of the instruction C. FIG. 3(a) shows the generated partial program 50. The instruction sequence transforming unit 120 makes a transformation such that the save instruction and the recovery instruction are executed outside the generated partial program 50 to be optimized. FIG. 3(b) shows the transformed partial program 50. In this case, the instruction sequence replacing unit 130 replaces the partial program 50 including the instruction C with the target instruction sequence.
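The save and recovery pair can be sketched as follows, modelling the storage area as a dictionary. The function name, the state layout and the example instruction are hypothetical and serve only to show that the added instruction leaves no visible effect.

```python
def run_with_cancel(state, added_instruction, clobbered):
    """Execute an added instruction while cancelling its visible effect."""
    saved = {key: state[key] for key in clobbered}  # save instruction (before C)
    added_instruction(state)                        # the added instruction C
    state.update(saved)                             # recovery instruction (after C)
    return state


def instruction_c(state):
    # Hypothetical absent instruction: overwrites "tmp".
    state["tmp"] = state["i"] + 100


final_state = run_with_cancel({"tmp": 7, "i": 0}, instruction_c, ["tmp"])
```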

Preferably, the target partial program detecting unit 110 computes, with respect to each of the partial programs, an estimate of the processing time increased in a case where an absent instruction and a save instruction or the like are added to the partial program. The target partial program detecting unit 110 also computes an estimate of the reduced processing time in a case where the partial program is replaced with the target instruction sequence by the instruction sequence replacing unit 130. The target partial program detecting unit 110 then detects the partial program as the partial program to be optimized if the increased processing time is shorter than the reduced processing time, thus optimizing only the portions whose transformation improves efficiency.
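The profitability test reduces to a single comparison. In the sketch below the cost figures are arbitrary illustrative numbers, and how the estimates are obtained is left open, as in the description.

```python
def worth_optimizing(added_time, original_time, replaced_time):
    """Accept only if the time added by compensation code is less than the time saved."""
    reduced_time = original_time - replaced_time
    return added_time < reduced_time


# Hypothetical estimates in arbitrary time units.
accept = worth_optimizing(added_time=4, original_time=100, replaced_time=20)
reject = worth_optimizing(added_time=90, original_time=100, replaced_time=20)
```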

Further, for example, in a case where a comparison instruction included in the pattern 20 and a comparison instruction included in the partial program 40 differ only in the expression to be compared, the instruction sequence transforming unit 120 may change only the constants with which the expression is compared in the comparison instruction included in the partial program 40. For example, in a case where the pattern 20 includes an instruction “switch(ch)” and the partial program 40 includes an instruction “switch(ch+1)”, the instruction sequence transforming unit 120 makes a transformation by subtracting 1 from the constant of each case statement in the partial program 40, and transforms the instruction “switch(ch+1)” in the partial program 40 into the instruction “switch(ch)”. Consequently, the instruction sequence replacing unit 130 can match the instruction included in the partial program 40 to the pattern 20.
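This constant adjustment can be checked with a small sketch in which a switch is modelled as a mapping from case constants to branch labels. The function name and the labels are hypothetical.

```python
def normalize_switch(offset, cases):
    """Rewrite switch(ch + offset) over `cases` as switch(ch) with shifted constants."""
    return {constant - offset: target for constant, target in cases.items()}


original_cases = {1: "L_null", 33: "L_space"}       # cases of switch(ch + 1)
normalized = normalize_switch(1, original_cases)    # cases of switch(ch)

# Both forms select the same label for every byte value.
equivalent = all(original_cases.get(ch + 1) == normalized.get(ch)
                 for ch in range(256))
```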

FIG. 4 is a flowchart showing a process in which the compiler 10 optimizes a program to be optimized. The optimization candidate detecting unit 100 detects a partial program as a candidate to be optimized (S400). Description will be made of a concrete example. For example, the optimization candidate detecting unit 100 first detects, as a candidate to be optimized, a partial program including a memory access instruction to access the same type of data as data to be accessed according to a memory access instruction included in the pattern 20.

For example, in a case where the pattern 20 includes a load instruction, the optimization candidate detecting unit 100 determines that a partial program is a candidate to be optimized, on condition that the partial program includes the load instruction. Similarly, in a case where the pattern 20 includes a store instruction, the optimization candidate detecting unit 100 determines that a partial program is a candidate to be optimized, on condition that the partial program includes the store instruction. Types of data to be accessed include types indicating kinds of data (an array variable, an instance variable, and a class variable) as well as “byte”, “int”, “float” and “double”, which are types indicating ranges of data expression.

In a case where the pattern 20 includes loop processing, the optimization candidate detecting unit 100 detects a partial program including loop processing as a candidate for a partial program to be optimized. The loop processing is an instruction sequence corresponding to strongly connected components in a case where the program is expressed as a control flow graph. Also, the optimization candidate detecting unit 100 detects a partial program as a candidate to be optimized, further on condition that the partial program includes loop processing having the same increment in a loop induction variable as that in the loop processing included in the pattern 20.
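Finding loop processing as strongly connected components of a control flow graph can be sketched with Tarjan's algorithm, one standard choice that the description does not itself prescribe. The graph encoding (a dictionary from node to successor list) and the node names are assumptions.

```python
def strongly_connected_components(graph):
    """Tarjan's algorithm; graph maps each node to a list of successor nodes."""
    index, low, on_stack, stack, result = {}, {}, set(), [], []

    def visit(v):
        index[v] = low[v] = len(index)
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, ()):
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:          # v is the root of a component
            component = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                component.add(w)
                if w == v:
                    break
            result.append(component)

    for v in graph:
        if v not in index:
            visit(v)
    return result


def loops(graph):
    """Keep only components that actually cycle (several nodes, or a self-edge)."""
    return [c for c in strongly_connected_components(graph)
            if len(c) > 1 or any(v in graph.get(v, ()) for v in c)]


cfg = {"entry": ["A"], "A": ["B"], "B": ["C", "exit"],
       "C": ["D"], "D": ["A"], "exit": []}
detected = loops(cfg)
```

The recursive form is kept for brevity; very large control flow graphs would need an iterative variant to avoid deep recursion.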

Also, the optimization candidate detecting unit 100 may detect a partial program as a candidate to be optimized, further on condition that the loop processing is repeated a number of times equal to or larger than a predetermined reference number of times. The above-described processing narrows down the range in which optimization is tried, thus reducing the processing time required for compilation. As a result, the facility with which the technique described with respect to this embodiment is applied to a dynamic compiler such as a just-in-time compiler is improved.

Preferably, in a case where an optimization level indicating the degree of optimization needed by a user is set, the optimization candidate detecting unit 100 changes, according to the optimization level, a criterion for detection of a candidate to be optimized. For example, in a case where a higher optimization level is set, the optimization candidate detecting unit 100 detects a larger number of partial programs as candidates to be optimized in comparison with a case where a lower optimization level is set. Further, the optimization candidate detecting unit 100 may omit processing in S400, for example, depending on a setting made by the user.

Subsequently, the target partial program detecting unit 110 detects, as a partial program to be optimized, a partial program including instructions corresponding to all the instructions included in the pattern 20 from among the partial programs detected as candidates to be optimized (S410). A concrete example of this processing will be described with respect to a case where the pattern 20 includes loop processing. The target partial program detecting unit 110 determines the correspondence between instructions with respect to instructions in the loop processing and makes no determination as to coincidence between dependences. On the other hand, the target partial program detecting unit 110 determines not only the correspondence between instructions but also the coincidence between dependences with respect to instructions outside the loop processing.

That is, if the target partial program detecting unit 110 determines, with respect to each of the partial programs, that the partial program includes in the loop processing the instructions corresponding to all the instructions included in the loop processing, and that all the instructions outside the loop processing in the partial program conform to the dependences determined by the pattern 20, it detects the partial program as a program to be optimized. In this way, loops etc. having the same dependences but differing in recurrence phase can be suitably detected.

Description will be made of further details. The target partial program detecting unit 110 first generates a dependence graph in which each of multiple instructions included in each of the partial programs is set as a node and execution dependences between multiple instructions are represented by directed edges. The target partial program detecting unit 110 then makes a determination as to correspondence in form between the generated dependence graph and the dependence graph indicating the pattern 20, by means of an algorithm for determination as to graph form correspondence.

The target partial program detecting unit 110 may detect the same type of dependence graph as the pattern 20, for example, by the topological embedding technique described in Non-Patent Document 1. Alternatively, the target partial program detecting unit 110 may detect the dependence graph corresponding in form to the pattern 20, by detecting a piece of program having the largest portion in common with the pattern 20 on the basis of the method described in Non-Patent Document 3. Each of these techniques allows determination of correspondence in form even in a case where an arbitrary node is included between the nodes in the dependence graph of the pattern 20. Therefore, the instruction sequence shown in FIG. 2(d) can be detected as a partial program to be optimized.

Also, the dependence graph with respect to loop processing is handled as a tree structure extending infinitely. Then, with respect to (b), A->B->C->D->A->B->C-> . . . is examined to find correspondence in form. With respect to (c), the loop is unrolled to obtain A->C->B->D->A->C->B->D->A . . . . This algorithm allows an arbitrary node to be included between the nodes, as mentioned above. Thus, instructions A, B, C and D, taken in that order from the unrolled sequence, are connected to form A->B->C->D. Therefore, correspondence in form to the pattern is also determined with respect to this structure.
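For straight-line loop bodies, the effect of unrolling and allowing intervening nodes can be approximated by a subsequence test, sketched below. This is a deliberate simplification: the topological embedding of Non-Patent Document 1 works on graphs, whereas this sketch treats the pattern and the loop body as flat lists, and the function name is hypothetical.

```python
def embeds_in_loop(pattern, loop_body):
    """True if `pattern` is a subsequence of the loop body unrolled enough times.

    One repetition per pattern element suffices, because each element, if it
    occurs at all, is found within one further pass over the body.
    """
    unrolled = iter(loop_body * len(pattern))
    # `p in unrolled` consumes the iterator up to the first match, so the
    # search for the next element resumes where the previous one stopped.
    return all(p in unrolled for p in pattern)


pattern = ["A", "B", "C", "D"]
phase_shifted = embeds_in_loop(pattern, ["B", "C", "D", "A"])
reordered = embeds_in_loop(pattern, ["A", "C", "B", "D"])
with_extra = embeds_in_loop(pattern, ["A", "B", "C", "E", "D"])
missing_c = embeds_in_loop(pattern, ["A", "B", "D"])
```

A positive result only establishes that the instructions occur in a compatible order; whether the program can actually be transformed still depends on the dependence checks of the transforming unit.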

Thus, when the target partial program detecting unit 110 determines, with respect to each of the partial programs, that the instructions in the partial program corresponding to all the instructions included in the pattern 20 are executed in the execution order designated by the execution dependences between the instructions in the pattern 20, it can detect the partial program as a partial program to be optimized. In this way, each of the instruction sequences shown in FIGS. 2(b), 2(c), and 2(d), for example, can be detected as a partial program to be optimized.

The topological embedding technique may be extended by a method described below to enable the target partial program detecting unit 110 to detect the instruction sequence shown in FIG. 2(e) as a partial program to be optimized. More specifically, in the topological embedding algorithm, a portion which determines that a partial program lacks a node corresponding to one of the nodes in the pattern 20 is changed so that it determines that the node has been detected regardless of the actual lack of the node.

Each time the absence of one of the nodes is detected, the target partial program detecting unit 110 records information for identification of the node to obtain a set of absent nodes. As a result, the target partial program detecting unit 110 can compute the proportion of the instructions included in the pattern 20 that have corresponding instructions in the partial program, relative to all the instructions included in the pattern 20. Further, by means of this algorithm, the target partial program detecting unit 110 can detect an instruction sequence having two or more of the characteristics shown in FIGS. 2(b) to 2(e).
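Recording absent nodes and computing the matched proportion can be sketched as follows. The reference proportion of 0.7 and the flat-list representation are assumptions chosen for illustration; the description leaves the reference value open.

```python
def match_with_absent(pattern, program, reference=0.7):
    """Return (accepted, absent_nodes, proportion) for a tolerant match."""
    present = set(program)
    # Each missing pattern node is recorded instead of aborting the match.
    absent = [node for node in pattern if node not in present]
    proportion = (len(pattern) - len(absent)) / len(pattern)
    return proportion > reference, absent, proportion


accepted, absent, proportion = match_with_absent(
    ["A", "B", "C", "D"], ["A", "B", "D"])
too_sparse, _, _ = match_with_absent(["A", "B", "C", "D"], ["A", "D"])
```

The recorded absent set is what the transforming unit would later need in order to add the missing instructions together with their cancel instructions.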

Subsequently, the instruction sequence transforming unit 120 transforms,in the partial program to be optimized, instructions other than thoseinstructions corresponding to instructions included in the pattern 20and those instructions having execution dependencies different from thepattern 20 so that dependencies between instructions included in thepartial program 40 match the pattern 20 S420). The instruction sequencereplacing unit 130 replaces the transformed partial pattern 50 with thetarget instruction sequence determined in accordance with the pattern tobe replaced (S430). It is not necessarily possible that all instructionsequences detected by the target partial program detecting unit 110 willbe replaced with target instruction sequences. That is, in some cases,the instruction sequence transforming unit 120 fails to transform thepartial program 40 and the instruction sequence replacing unit 130 failsto replace the instruction sequence.

Four examples of a process in which the compiler 10 is supplied with a program to be optimized and optimizes the program will be described in turn.

FIRST EXAMPLE

FIG. 5(a) shows the pattern 20 to be replaced in the first example.

The pattern 20 is a pattern to be replaced for detection of an instruction sequence shown in FIG. 16. In the instruction sequence shown in FIG. 16, the number of branch destinations of a switch instruction (2) can vary from 2 to 256. It might therefore seem necessary to generate 255 patterns 20, one for each number of branch destinations from 2 to 256, in order to suitably detect the switch instruction. However, comparing such a large number of patterns 20 with partial programs is inefficient because the comparison requires a long processing time.

Accordingly, the target partial program detecting unit 110 detects partial programs to be optimized by using the illustrated pattern 20. With respect to a multiple-branch instruction (e.g., switch instruction (2)) that hands over control to an external instruction outside the pattern 20 in a case where one of multiple conditions is satisfied, this pattern 20 has a representative edge that represents the multiple control flows through which control is handed over from the multiple-branch instruction to the external instruction.

The target partial program detecting unit 110 determines that a partial program includes the corresponding multiple-branch instruction if the dependence graph showing control flows of the partial program has an edge corresponding to the representative edge. That is, the target partial program detecting unit 110 determines that the multiple-branch instructions correspond to each other, on condition that the number of edges of the multiple-branch instruction of the partial program is larger than the number of edges of the multiple-branch instruction of the pattern 20.
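One way to picture the representative edge is to collapse every control-flow edge that leaves the candidate region into a single marker before comparing against the pattern. The following is a minimal sketch under that assumption; the function names, the "EXTERNAL" marker, and the node names are hypothetical and do not come from the figures.

```python
def collapse_external_edges(successors, region):
    """Replace every successor outside `region` with the single marker "EXTERNAL"."""
    return {s if s in region else "EXTERNAL" for s in successors}

def branch_matches(pattern_successors, program_successors, program_region, correspondence):
    """Compare a program's multiple-branch instruction with the pattern's.

    pattern_successors: successors of the pattern's branch; its exits are already
        expressed as the one representative edge "EXTERNAL"
    correspondence: maps program node names to pattern node names
    """
    collapsed = collapse_external_edges(program_successors, program_region)
    translated = {correspondence.get(s, s) for s in collapsed}
    return translated == set(pattern_successors)

# Pattern: switch (2) continues to instruction "3" or leaves via the representative edge.
pattern_switch = {"3", "EXTERNAL"}

# Program: switch "b" continues to "d" or leaves the loop through three different exits.
matches = branch_matches(
    pattern_switch,
    ["d", "exit1", "exit2", "ret"],
    program_region={"a", "b", "d"},
    correspondence={"d": "3"},
)
```

With this encoding a single pattern covers a switch with any number of exits, which is the motivation given above for avoiding 255 separate patterns.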

FIG. 5(b) shows a target instruction sequence template 30 in the first example. This figure shows program source code indicating details of processing according to instructions included in the target instruction sequence template 30. In practice, the target instruction sequence template 30 may be described by means of a predetermined intermediate code or a machine language.

In the target instruction sequence template 30, a variable “bytearray” represents an address in a storage area in which a value with which comparison is made is stored by a TRT instruction. A variable “i” represents an index for scanning the storage area. The instruction sequence replacing unit 130 secures the storage area for storing a number of values equal to the value of the variable “bytearray”, and stores in a variable “table” the address stored in the storage area. For example, when the value of index i in the “bytearray” storage area satisfies a condition for termination of the loop, the value of index i in the “table” storage area is a non-zero value.

FIG. 6 shows a first example of the partial program 40 to be optimized by the compiler 10. The program shown in FIG. 6(a) is a source program representing details of processing in accordance with the partial program 40. FIG. 6(b) is a control flow graph of the partial program 40. According to the partial program 40, a storage area determined by a variable “data” is scanned in order, and loop processing is terminated when a constant “GREATERTHAN” or a constant 0 is detected. When the constant 0 is detected, a return from a method call is made.

As is apparent from comparison between this figure and FIG. 5(a), the pattern 20 and the partial program 40 have different orders of processing according to instructions to read out the value from the storage area. More specifically, while the value is read out from the storage area by instruction (1) according to the pattern 20, the value is read out from the storage area by instruction (a) and instruction (e) according to the partial program 40. According to the conventional art, correspondence in form between programs can be determined without regard to, for example, differences in variable names, but correspondence in form between programs cannot be determined if the placement of instructions is changed. That is, the partial program 40 shown in this figure cannot be detected as a program to be optimized.

In contrast, the compiler 10 in this embodiment is capable of detecting correspondence between instruction (1) in FIG. 5(a) and the instruction (a) in FIG. 6(b) and correspondence between instruction (1) and the instruction (e) in FIG. 6(b). The compiler 10 is also capable of detecting correspondence between instruction (2) in FIG. 5(a) and the instruction (b) in FIG. 6(b) and correspondence between instruction (2) and the instruction (c) in FIG. 6(b). Further, the compiler 10 is capable of detecting correspondence between instruction (3) in FIG. 5(a) and the instruction (d) in FIG. 6(b).

The instruction sequence transforming unit 120 obtains a detection result showing that instruction (b) and instruction (c) are successively executed and that instructions (b) and (c) are transformable into instruction (2) in FIG. 5(a). The instruction sequence transforming unit 120 also obtains a detection result showing that the pattern 20 and the partial program 40 have the same dependence of the variable ch. The instruction sequence transforming unit 120 further obtains a detection result showing that the pattern 20 and the partial program 40 have the same dependence with respect to the index variable in the storage area. The instruction sequence transforming unit 120 further obtains a detection result showing that the partial program 40 does not include an additional instruction in comparison with the pattern 20. If all the above-described conditions are satisfied, the instruction sequence replacing unit 130 replaces the partial program 40 with the target instruction sequence based on the pattern 20.

FIG. 7 shows the resultant partial program 60 in the first example. The instruction sequence replacing unit 130 generates a storage area indicated by the variable “table” when a target instruction sequence is generated. For example, if the value of index i in the variable “data” storage area is GREATERTHAN or 0, the value of index i in the “table” storage area is a non-zero value. The instruction sequence replacing unit 130 initializes other values in the “table” storage area to 0.

As is apparent from this processing, the instruction sequence replacing unit 130 can optimize the instruction sequence if determination as to whether or not the loop processing is terminated is made on the basis of the value of index i. Therefore, the target partial program detecting unit 110 may detect a partial program as the partial program 40, on condition that determination as to whether or not the loop processing is terminated is made on the basis of the value of index i, even in a case where the partial program and the pattern 20 have different switch instruction references. For example, in a case where a partial program includes an instruction “switch (map1[ch])”, the target partial program detecting unit 110 may detect the partial program as the partial program 40, on condition that the array variable map1 corresponds to a constant array.

Processing in accordance with the resultant partial program 60 will be described. According to a while instruction (3) and a TRT instruction (5) in the resultant partial program 60, the computer scans on a 256-byte basis the storage area determined by the variable “data”. The TRT instruction (5) can be executed at an extremely high speed in comparison with the process in which loop processing is repeated 256 times. Therefore, the speed of scanning of the storage area determined by the variable “data” can be increased. For example, in a case where 0 or GREATERTHAN is stored within the initial 256 bytes in the storage area, instructions (1) to (9) are executed in this order. Thus, loop processing is not executed and, therefore, the efficiency is markedly high.
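The table-driven scan described above can be emulated in software to show its shape. This is only an illustrative sketch and is not the code of FIG. 7: the helper names are hypothetical, the numeric value chosen for GREATERTHAN is an assumption (the text only names the constant), and the inner function stands in for what a single TRT instruction would do in hardware.

```python
GREATERTHAN = ord(">")  # assumed value; the text only names the constant

def build_table(stop_values):
    """Build a 256-entry table: non-zero for byte values that end the scan, 0 otherwise."""
    table = bytearray(256)
    for value in stop_values:
        table[value] = 1
    return table

def trt(data, start, length, table):
    """Software stand-in for one TRT step over data[start:start+length].

    Returns the index of the first byte whose table entry is non-zero, or -1.
    """
    end = min(start + length, len(data))
    for i in range(start, end):
        if table[data[i]] != 0:
            return i
    return -1

def scan(data, table):
    """Scan `data` on a 256-byte basis, one trt() call per block."""
    i = 0
    while i < len(data):
        hit = trt(data, i, 256, table)
        if hit >= 0:
            return hit
        i += 256
    return -1

table = build_table([0, GREATERTHAN])
first_hit = scan(b"abc>def", table)
late_hit = scan(bytes([65]) * 300 + b"\x00", table)
```

When a terminating value lies within the first 256 bytes, `scan` returns after a single `trt` call, which mirrors the case above in which no loop iteration is repeated.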

According to the first example, as described above, the instruction sequence replacing unit 130 can replace processing realized by two or more instructions, such as a while instruction and a switch instruction, with a TRT instruction, which is one instruction for performing the same processing as that performed by the multiple instructions.

As a modification of the first example, a case is conceivable in which the pattern 20 includes a nullcheck instruction for determining whether or not the value of the variable “bytearray” is null. For example, the nullcheck instruction is ordinarily used immediately before execution of instruction (1) each time instruction (1) is executed. The nullcheck instruction is used for the purpose of preventing readout of the value from an invalid address by instruction (1).

The value of the variable “bytearray” is constant independently of repetition of the loop. Therefore, the result of execution of the nullcheck instruction is the same independently of repetition of the loop. In such a case, the instruction sequence transforming unit 120 executes the nullcheck instruction outside the loop processing and, therefore, the instruction sequence replacing unit 130 can replace the loop processing from which the nullcheck instruction has been removed with a target instruction sequence.
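The effect of moving the loop-invariant nullcheck out of the loop can be sketched as a before/after pair. The function names and the use of `ValueError` for a failed check are assumptions made for this illustration; the loop is written so that the check runs on the first iteration in both versions, which is the situation described above.

```python
def scan_with_check_in_loop(bytearray_, stop):
    """Before: the nullcheck is repeated immediately before every read."""
    i = 0
    while True:
        if bytearray_ is None:           # nullcheck inside the loop
            raise ValueError("null array")
        if bytearray_[i] == stop:        # corresponds to the read by instruction (1)
            return i
        i += 1

def scan_with_hoisted_check(bytearray_, stop):
    """After: the check runs once, because its result cannot change between iterations."""
    if bytearray_ is None:               # nullcheck executed outside the loop
        raise ValueError("null array")
    i = 0
    while True:
        if bytearray_[i] == stop:
            return i
        i += 1

before = scan_with_check_in_loop(b"ab\x00", 0)
after = scan_with_hoisted_check(b"ab\x00", 0)
```

Once the check has been hoisted, the remaining loop body contains only the instructions of the pattern and is eligible for replacement.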

SECOND EXAMPLE

FIG. 8 shows a second example of the partial program 40 to be optimized by the compiler 10. The program shown in FIG. 8(a) is a source program representing details of processing in accordance with the partial program 40. FIG. 8(b) is a control flow graph of the partial program 40. According to the partial program 40, the computer scans in order a storage area determined by a variable “bytes”, with respect to the index indicated by a variable “offset”. Loop processing is terminated when a negative value is detected.

In the partial program 40, loop processing has two induction variables: the variable “offset” and a variable “count”. That is, the partial program 40 and the pattern 20 shown in FIG. 5(a) apparently differ in program structure from each other. Therefore, the conventional compiler cannot recognize the partial program 40 as a program to be optimized.

According to the compiler 10 in this embodiment, the target partial program detecting unit 110 can detect a partial program as the partial program 40 even if the partial program has an additional instruction in comparison with the pattern 20. More specifically, the target partial program detecting unit 110 can obtain a detection result showing that instruction (1) in FIG. 5(a) corresponds to instruction (a) in FIG. 8(b), a detection result showing that instruction (2) in FIG. 5(a) corresponds to instruction (b) in FIG. 8(b), and a detection result showing that instruction (3) in FIG. 5(a) corresponds to instruction (c) in FIG. 8(b).

In this case, the instruction sequence transforming unit 120 generates new loop processing to execute an additional instruction. Consequently, the instruction sequence replacing unit 130 can replace instructions other than the additional instruction in the program to be optimized with a target instruction sequence. The compiler 10 may further optimize the newly generated loop processing. That is, the compiler 10 can optimize the newly generated loop processing into processing for computing the value of the variable “count” from the value of the variable “offset”.
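Computing “count” from “offset” can be sketched as below. FIG. 8 is not reproduced here, so the loop shape is an assumption: both variables are taken to advance by one per iteration, and the scan stops at the first negative element. The function names are hypothetical.

```python
def original_loop(data, offset, count):
    """Assumed shape of the loop: two induction variables advance together."""
    while data[offset] >= 0:
        offset += 1
        count += 1
    return offset, count

def transformed(data, offset, count):
    """After the transformation: only `offset` is advanced by the scan.

    The while loop below stands in for the scan that would be replaced with the
    target instruction sequence; `count` is then derived from how far `offset` moved.
    """
    start = offset
    while data[offset] >= 0:
        offset += 1
    count += offset - start      # replaces the separate loop that incremented count
    return offset, count

sample = [5, 3, 7, -1, 2]
result_before = original_loop(sample, 1, 10)
result_after = transformed(sample, 1, 10)
```

Because the extra induction variable no longer appears in the scanning loop, that loop has the same structure as the pattern.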

FIG. 9 shows the resultant partial program 60 in the second example. The instruction sequence replacing unit 130 generates a storage area indicated by the variable “table” when a target instruction sequence is generated. For example, if the value of index i in the variable “bytes” storage area is a negative value, the value of index i in the “table” storage area is a non-zero value. The instruction sequence replacing unit 130 initializes other values in the “table” storage area to 0.

Processing in accordance with the resultant partial program 60 will be described. According to a while instruction (3) and a TRT instruction (5) in the resultant partial program 60, the computer scans on a 256-byte basis the storage area determined by the variable “bytes”. The TRT instruction (5) can be executed at an extremely high speed in comparison with the process in which loop processing is repeated 256 times. Therefore, the speed of scanning of the storage area determined by the variable “bytes” can be increased. For example, in a case where a negative value is stored within the initial 256 bytes in the storage area, instructions (1) to (9) are executed in this order. Thus, loop processing is not executed and, therefore, the efficiency is markedly high.

THIRD EXAMPLE

FIG. 10 shows a third example of the partial program 40 to be optimized by the compiler 10. This figure shows a source program representing details of processing in accordance with the partial program 40. According to the partial program 40, the computer scans in order a storage area determined by a variable “bytes” with respect to the index indicated by a variable “offset”. Loop processing is terminated when a negative value is detected. In this loop processing, a storage area determined by a variable “a” is initialized in order from the top.

In the partial program 40, loop processing has two induction variables: the variable “offset” and a variable “count”. That is, the partial program 40 and the pattern 20 shown in FIG. 5(a) apparently differ in program structure from each other. In this case, the conventional compiler cannot recognize the partial program 40 as a program to be optimized.

According to the compiler 10 in this embodiment, the target partial program detecting unit 110 can detect a partial program as the partial program 40 even if the partial program has an additional instruction in comparison with the pattern 20. Accordingly, the instruction sequence transforming unit 120 divides the loop processing of the partial program 40 into two loops, one executing the additional instruction and the other executing the instruction sequence other than the additional instruction.

FIG. 11 shows the resultant partial program 60 in the third example. The program shown in FIG. 11(a) is the resultant partial program 60 including a target instruction sequence substituted by the instruction sequence replacing unit 130. A while instruction and a conditional branch instruction are replaced with a TRT instruction, as are those shown in FIG. 9. The additional instruction to initialize the storage area determined by the variable “a” is executed in the newly generated loop processings (2) and (3).
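The division of the loop can be sketched as a before/after pair. FIG. 10 and FIG. 11 are not reproduced here, so the loop shape below is an assumption: the additional instruction is taken to be `a[count] = 0`, and both induction variables advance by one per iteration. The function names are hypothetical.

```python
def original(data, offset, a):
    """Assumed shape of the loop: scanning and initialization are interleaved."""
    count = 0
    while data[offset] >= 0:
        a[count] = 0             # additional instruction, absent from the pattern
        offset += 1
        count += 1
    return offset, count

def split(data, offset, a):
    """After loop division: one loop scans, a second loop runs the additional instruction."""
    start = offset
    while data[offset] >= 0:     # loop 1: matches the pattern and can be replaced
        offset += 1
    count = offset - start
    for k in range(count):       # loop 2: newly generated loop for the additional instruction
        a[k] = 0
    return offset, count

a_before = [9, 9, 9, 9, 9]
a_after = [9, 9, 9, 9, 9]
r_before = original([1, 2, 3, -1], 0, a_before)
r_after = split([1, 2, 3, -1], 0, a_after)
```

The division is valid in this sketch because the initialization writes only to `a` and the scan reads only `data`, so neither loop depends on the other's results within an iteration.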

FIG. 11(b) shows the resultant partial program 60 further optimized by the compiler 10. As shown in this figure, the compiler 10 may optimize the loop processing that initializes the storage area determined by the variable “a” into an XC instruction. According to the XC instruction, a storage area of a predetermined size can be initialized by a predetermined value. The XC instruction is executed at an extremely high speed in comparison with processing that initializes a 256-byte storage area in order by means of a loop. Therefore, the program to be optimized can be optimized still more effectively.

The XC instruction is capable of initializing a storage area of a size designated by a constant operand. For example, instruction (1) is an XC instruction for initializing a storage area of a constant size of 256 bytes. Further, according to the EXECUTE instruction in accordance with S/390 provided by IBM Corporation, a value designated by a constant operand can be changed during execution of a program (see pp. 7-108 of Non-Patent Document 4). Thus, the XC instruction is substantially capable of initializing a storage area of a size designated by a register. For example, instruction (3) in this figure represents an XC instruction such that a constant operand which designates the size of a storage area to be initialized is changed to the value of a variable T_inccount by the EXECUTE instruction.

FOURTH EXAMPLE

FIG. 12 shows a fourth example of the partial program 40 to be optimized by the compiler 10. The program shown in FIG. 12(a) is a source program representing details of processing in accordance with the partial program 40. FIG. 12(b) is a control flow graph of the partial program 40. According to the partial program 40, the computer counts the number of bits set to 1 in data stored in a variable “input” and stores the count value in a variable “output”. This program is not efficient since the processing time increases in proportion to the number of bits in the variable “input”.

FIG. 13 shows the resultant partial program 60 in the fourth example. The instruction sequence replacing unit 130 replaces the partial program 40 shown in FIG. 12 with the resultant partial program 60 shown in FIG. 13. According to the resultant partial program 60, the computer can count the number of bits set to 1 in the variable “input” at a higher speed in comparison with the partial program 40. In this way, the instruction sequence replacing unit 130 may perform not only processing for replacement with a particular instruction but also processing for replacing an algorithm. That is, the instruction sequence replacing unit 130 may replace an instruction sequence for processing based on an algorithm requiring a longer processing time with an instruction sequence for processing based on a different algorithm requiring a shorter processing time.
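To illustrate what replacing an algorithm can look like, the sketch below contrasts a bit-by-bit count with a well-known parallel (mask-and-add) bit count. FIG. 12 and FIG. 13 are not reproduced here, so it is an assumption that they resemble these two functions; the function names and the 32-bit width of “input” are likewise assumed for this example.

```python
def count_bits_loop(value):
    """Bit-by-bit count: the work grows with the number of bits examined (32 here)."""
    count = 0
    for _ in range(32):
        count += value & 1
        value >>= 1
    return count

def count_bits_parallel(value):
    """Parallel count for a 32-bit unsigned value using a fixed number of operations."""
    value = value - ((value >> 1) & 0x55555555)                  # 2-bit partial sums
    value = (value & 0x33333333) + ((value >> 2) & 0x33333333)   # 4-bit partial sums
    value = (value + (value >> 4)) & 0x0F0F0F0F                  # 8-bit partial sums
    # Add the four byte sums; the total ends up in the top byte of the low 32 bits.
    return ((value * 0x01010101) & 0xFFFFFFFF) >> 24

samples = [0, 1, 0xFF, 0x80000000, 0x12345678, 0xFFFFFFFF]
results = [(count_bits_loop(v), count_bits_parallel(v)) for v in samples]
```

Both functions return the same count for every 32-bit input, while the second performs a constant number of arithmetic operations regardless of the input.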

FIG. 14 is a diagram for explaining the effect of this embodiment. The speed of the partial program 40 shown in FIG. 6 was compared with the speed of the resultant partial program 60 shown in FIG. 7. The table in FIG. 14 shows the rate at which the speed of the resultant partial program 60 is increased relative to that of the partial program 40. While the increase rate with respect to 256 data items is shown in this figure, it has also been confirmed that the increase rate is further improved with respect to a number of data items exceeding 256.

As shown in the figure, the efficiency of execution of the program can be improved by the optimization in a case where eight or more data items on average are scanned. Accordingly, the compiler 10 may, for example, select and optimize only loop processing that is highly likely to scan eight or more data items, to improve the efficiency of execution of the entire program.

FIG. 15 shows an example of a hardware configuration of a computer 500 which functions as the compiler 10. The computer 500 has a CPU peripheral section having a CPU 1000, a RAM 1020 and a graphic controller 1075 connected to each other by a host controller 1082, an input/output section having a communication interface 1030, a hard disk drive 1040 and a CD-ROM drive 1060 connected to the host controller 1082 by an input/output controller 1084, and a legacy input/output section having a BIOS 1010, a flexible disk drive 1050 and an input/output chip 1070 connected to the input/output controller 1084.

The host controller 1082 connects the RAM 1020 to the CPU 1000 and the graphic controller 1075, which access the RAM 1020 at a high transfer rate. The CPU 1000 operates on the basis of programs stored in the BIOS 1010 and the RAM 1020, and controls each component. The graphic controller 1075 obtains image data generated, for example, by the CPU 1000 on a frame buffer provided in the RAM 1020, and displays the image data on a display device 1080. Alternatively, the graphic controller 1075 may contain therein a frame buffer for storing image data generated, for example, by the CPU 1000.

The input/output controller 1084 connects the host controller 1082 to the communication interface 1030, the hard disk drive 1040 and the CD-ROM drive 1060, which are input/output devices of a comparatively high speed. The communication interface 1030 performs communication with an external unit via a network. The hard disk drive 1040 stores programs and data used by the computer 500. The CD-ROM drive 1060 reads a program or data from a CD-ROM 1095 and provides the read program or data to the input/output chip 1070 via the RAM 1020.

To the input/output controller 1084, the BIOS 1010 and input/output devices of a comparatively low speed, i.e., the flexible disk drive 1050, the input/output chip 1070 and the like, are also connected. The BIOS 1010 stores programs including a boot program executed by the CPU 1000 at the time of startup of the computer 500 and programs dependent on the hardware of the computer 500. The flexible disk drive 1050 reads a program or data from a flexible disk 1090 and provides the read program or data to the input/output chip 1070 via the RAM 1020. The input/output chip 1070 connects the flexible disk 1090 and various input/output devices, for example, through a parallel port, a serial port, a keyboard port, a mouse port, etc.

A program provided to the computer 500 is provided by a user in a state of being stored on a recording medium, such as the flexible disk 1090, the CD-ROM 1095, or an IC card. The program is read out from the recording medium, installed in the computer 500 via the input/output chip 1070 and/or the input/output controller 1084, and executed in the computer 500. The operations that this program, e.g., the compiler program, causes the computer 500 to perform are the same as the operations in the computer 500 described above with reference to FIGS. 1 to 14. Therefore, description of the operations will not be repeated.

The above-described program may be stored on an external storage medium. As the recording medium, an optical recording medium such as a DVD or a PD, a magneto-optic recording medium such as an MD, a tape medium, a semiconductor memory such as an IC card, or the like can be used as well as the flexible disk 1090 and the CD-ROM 1095. Also, a storage device such as a hard disk, a RAM or the like provided in a server system connected to a special-purpose communication network or the Internet may be used as the recording medium to provide the program to the computer 500 via the network.

While the present invention has been described with respect to the embodiment thereof, the technical scope of the present invention is not limited to the scope described above with respect to the embodiment. It is apparent to those skilled in the art that various changes and modifications can be made in the above-described embodiment. It is apparent from the description in the appended claims that other embodiments of the invention provided by making such changes and modifications are also included in the technical scope of the present invention.

Variations described for the present invention can be realized in any combination desirable for each particular application. Thus particular limitations, and/or embodiment enhancements described herein, which may have particular advantages to a particular application need not be used for all applications. Also, not all limitations need be implemented in methods, systems and/or apparatus including one or more concepts of the present invention.

The present invention can be realized in hardware, software, or a combination of hardware and software. A visualization tool according to the present invention can be realized in a centralized fashion in one computer system, or in a distributed fashion where different elements are spread across several interconnected computer systems. Any kind of computer system, or other apparatus adapted for carrying out the methods and/or functions described herein, is suitable. A typical combination of hardware and software could be a general purpose computer system with a computer program that, when being loaded and executed, controls the computer system such that it carries out the methods described herein. The present invention can also be embedded in a computer program product, which comprises all the features enabling the implementation of the methods described herein, and which, when loaded in a computer system, is able to carry out these methods.

Computer program means or computer program in the present context include any expression, in any language, code or notation, of a set of instructions intended to cause a system having an information processing capability to perform a particular function either directly or after conversion to another language, code or notation, and/or reproduction in a different material form.

Thus the invention includes an article of manufacture which comprises a computer usable medium having computer readable program code means embodied therein for causing a function described above. The computer readable program code means in the article of manufacture comprises computer readable program code means for causing a computer to effect the steps of a method of this invention. Similarly, the present invention may be implemented as a computer program product comprising a computer usable medium having computer readable program code means embodied therein for causing a function described above. The computer readable program code means in the computer program product comprises computer readable program code means for causing a computer to effect one or more functions of this invention. Furthermore, the present invention may be implemented as a program storage device readable by machine, tangibly embodying a program of instructions executable by the machine to perform method steps for causing one or more functions of this invention.

It is noted that the foregoing has outlined some of the more pertinent objects and embodiments of the present invention. This invention may be used for many applications. Thus, although the description is made for particular arrangements and methods, the intent and concept of the invention is suitable and applicable to other arrangements and applications. It will be clear to those skilled in the art that modifications to the disclosed embodiments can be effected without departing from the spirit and scope of the invention. The described embodiments ought to be construed to be merely illustrative of some of the more prominent features and applications of the invention. Other beneficial results can be realized by applying the disclosed invention in a different manner or modifying the invention in ways known to those familiar with the art.

1) An optimizing compiler detecting a pattern that is to be replaced and includes multiple predetermined instructions in a program to be optimized and replacing the detected pattern to be replaced with a target instruction sequence determined in accordance with the pattern to be replaced, comprising: a target partial program detecting unit for detecting, from among partial programs of said program to be optimized, a partial program including instructions corresponding to all instructions included in said pattern to be replaced as a partial program to be optimized; an instruction sequence transforming unit for transforming, in said partial program to be optimized, instructions other than those instructions corresponding to instructions included in said pattern to be replaced and those instructions having execution dependencies different from said pattern to be replaced, so that dependencies between instructions included in said partial program to be optimized match said pattern to be replaced; and an instruction sequence replacing unit for replacing said partial program to be optimized transformed by said instruction sequence transforming unit with a target instruction sequence determined in accordance with said pattern to be replaced.

2) The compiler according to claim 1, wherein said target partial program detecting unit detects each of the partial programs as the partial program to be optimized, if the instructions in the partial program corresponding to all the instructions included in said pattern to be replaced are executed in the order designated by the execution dependencies between the instructions in said pattern to be replaced.
3) The compiler according to claim 2, wherein in a case where said pattern to be replaced includes loop processing, said target partial program detecting unit detects each of the partial programs as the partial program to be optimized, if the partial program includes in loop processing the instructions corresponding to all the instructions included in said loop processing, and if all the instructions out of the loop processing in the partial program conform to the dependencies determined by said pattern to be replaced.

4) The compiler according to claim 1, wherein said pattern to be replaced is a dependence graph having as a node each of multiple instructions included in said pattern to be replaced and having directed edges representing the execution dependencies between multiple instructions, and wherein said target partial program detecting unit generates, with respect to each of the partial programs, a dependence graph having as a node each of multiple instructions included in the partial program and having directed edges representing the execution dependencies between multiple instructions, and makes a determination on the basis of the dependence graph and a dependence graph representing said pattern to be replaced as to whether or not the partial program should be detected as the partial program to be optimized.
5) The compiler according to claim 4, wherein said pattern to be replaced indicates control flows between the instructions, wherein the dependence graph of said pattern to be replaced has, with respect to a multiple-branch instruction to hand over control to an external instruction out of said pattern to be replaced in a case where one of multiple conditions is satisfied, a representative edge representative of multiple control flows through which control is handed over from the multiple-branch instruction to the external instruction, and wherein said target partial program detecting unit determines, with respect to each of multiple partial programs, that the partial program includes the corresponding multiple-branch instruction, if the dependence graph showing control flows of the partial program has an edge corresponding to the representative edge.

6) The compiler according to claim 1, further comprising an optimization candidate detecting unit which detects, in the program to be optimized, as a candidate for the partial program to be optimized, the partial program including a memory access instruction to access the same type of data as data to be accessed according to a memory access instruction included in said pattern to be replaced, wherein said target partial program detecting unit detects the partial program to be optimized in the partial programs detected by said optimization candidate detecting unit.

7) The compiler according to claim 1, further comprising an optimization candidate detecting unit which detects, in the program to be optimized, as a candidate for the partial program to be optimized, the partial program including loop processing having the same increment in a loop induction variable as that in loop processing included in said pattern to be replaced, wherein said target partial program detecting unit detects the partial program to be optimized in the partial programs detected by said optimization candidate detecting unit.
8) The compiler according to claim 1, further comprising an optimization candidate detecting unit which detects, in the program to be optimized, as a candidate for the partial program to be optimized, the partial program including loop processing repeatedly performed a number of times equal to or larger than a predetermined reference number of times, wherein said target partial program detecting unit detects the partial program to be optimized in the partial programs detected by said optimization candidate detecting unit.

9) The compiler according to claim 1, wherein said target partial program detecting unit detects, as the partial program to be optimized, one of the partial programs including in loop processing an additional instruction which does not correspond to any of the instructions included in said pattern to be replaced, wherein said instruction sequence transforming unit executes the additional instruction out of the loop processing of the partial program to be optimized, on condition that the result of execution of the additional instruction included in the loop processing is constant independently of repetition of the loop processing, and wherein said instruction sequence replacing unit replaces the loop processing from which the additional instruction has been removed with the target instruction sequence.
10) The compiler according to claim 1, wherein said target partial program detecting unit detects, as the partial program to be optimized, one of the partial programs including in loop processing an additional instruction which does not correspond to any of the instructions included in said pattern to be replaced, wherein said instruction sequence transforming unit divides the loop processing of the partial program to be optimized into two loop processes in which the additional instruction and the instruction sequence other than the additional instruction are respectively executed, and wherein said instruction sequence replacing unit replaces the instruction sequence other than the additional instruction with the target instruction sequence.
11) The compiler according to claim 1, wherein said target partial program detecting unit detects, as the partial program to be optimized, one of the partial programs including an instruction sequence executed in an order different from that of the execution dependencies in said pattern to be replaced, wherein said instruction sequence transforming unit changes, on the basis of the execution dependencies, the order of execution of the instructions in the partial program to be optimized, on condition that the result of processing of the partial program to be optimized is not changed even if the order of execution of the instructions in the partial program to be optimized is changed, and wherein said instruction sequence replacing unit replaces the partial program to be optimized in which the instruction execution order has been changed with the target instruction sequence.
12) The compiler according to claim 1, wherein if one of the partial programs lacks some of the instructions included in said pattern to be replaced, said target partial program detecting unit detects the partial program as the partial program to be optimized, on condition that the proportion of the instructions in the partial program corresponding to the other instructions included in said pattern to be replaced in all the instructions included in said pattern to be replaced is equal to or higher than a predetermined reference proportion.
13) The compiler according to claim 12, wherein said instruction sequence transforming unit adds to the partial program to be optimized the absent instruction absent in the instructions in the partial program to be optimized corresponding to all the instructions included in said pattern to be replaced, generates a cancel instruction to return the result of processing of the program to be optimized changed by the addition of the absent instruction to the processing result obtained in the case where the absent instruction is not added, and executes the cancel instruction out of the partial program to be optimized, and wherein said instruction sequence replacing unit replaces the partial program to be optimized including the added instruction that has been absent with the target instruction sequence.
14) The compiler according to claim 13, wherein said instruction sequence transforming unit generates as the cancel instruction a save instruction to save, before the absent instruction, the value in a storage area in which the result of processing according to the absent instruction is stored, and a recovery instruction to recover the value saved in the storage area after execution of the absent instruction, and executes the save instruction and the recovery instruction out of the partial program to be optimized.
15) The compiler according to claim 12, wherein said target partial program detecting unit detects each of the partial programs as the partial program to be optimized, if the processing time increased in a case where the instruction absent in the partial program in comparison with said pattern to be replaced is added to the partial program is shorter than the reduced processing time in a case where the partial program is replaced with the target instruction sequence by said instruction sequence transforming unit.
16) The compiler according to claim 1, wherein said instruction sequence replacing unit comprises a limitation selected from a group of limitations consisting of: said instruction sequence replacing unit generates the target instruction sequence by replacing each of variables in the target instruction template showing the structure of the target instruction with a variable in the partial program to be optimized corresponding to the variable in the target instruction template; said instruction sequence replacing unit replaces two or more of the instructions of the partial program to be optimized with one instruction for performing the same processing as that according to the two or more instructions; said instruction sequence replacing unit replaces an instruction sequence for processing based on an algorithm requiring a longer processing time with an instruction sequence for processing based on a different algorithm requiring a shorter processing time; and any combination of these limitations.
17) A computer program product comprising a computer usable medium having computer readable program code means embodied therein for causing functions of an optimizing compiler, the computer readable program code means in said computer program product comprising computer readable program code means for causing a computer to effect the functions of claim 1.
18) An optimization method for optimizing a program to be optimized by detecting, from a program to be optimized, a pattern that is to be replaced and includes multiple predetermined instructions and replacing the detected pattern to be replaced with a target instruction sequence determined in accordance with said instruction sequence to be replaced, comprising: a target partial program detecting step of detecting a partial program as said target partial program to be optimized if it is determined that instructions in said partial program that correspond to all instructions included in said pattern to be replaced are executed in an execution order indicated by execution dependencies between instructions in said pattern to be replaced; and an instruction sequence replacing step of replacing said partial program to be optimized with a target instruction sequence determined in accordance with said pattern to be replaced.
19) A compiler detecting, from a program to be optimized, a pattern that is to be replaced and includes multiple predetermined instructions and replacing the detected pattern to be replaced with a target instruction sequence determined in accordance with said pattern to be replaced, comprising: a target partial program detecting unit for detecting a partial program as said target partial program to be optimized if it is determined that instructions in said partial program that correspond to all instructions included in said pattern to be replaced are executed in an execution order indicated by execution dependencies between instructions in said pattern to be replaced; and an instruction sequence replacing unit for replacing said partial program to be optimized with a target instruction sequence determined in accordance with said pattern to be replaced.
20) The compiler according to claim 19, wherein in a case where said pattern to be replaced determines dependencies recurring among the instructions, said target partial program detecting unit detects, as the partial program to be optimized, one of the partial programs including an instruction sequence having the same dependencies as those determined by said pattern to be replaced but differing in recurrence phase, and wherein said instruction sequence replacing unit replaces the partial program to be optimized with the target instruction sequence.
21) An optimization method for detecting a pattern that is to be replaced and includes multiple predetermined instructions in a program to be optimized and replacing the detected pattern to be replaced with a target instruction sequence determined in accordance with the instruction sequence to be replaced, comprising: a target partial program detecting step of detecting, from among partial programs of said program to be optimized, a partial program including instructions corresponding to all instructions included in said pattern to be replaced as a partial program to be optimized; an instruction sequence transforming step of transforming, in said partial program to be optimized, instructions other than those instructions corresponding to instructions included in said pattern to be replaced and those instructions having execution dependencies different from said pattern to be replaced, so that dependencies between instructions included in said partial program to be optimized match said pattern to be replaced; an instruction replacing step of replacing said partial program to be optimized transformed by said instruction sequence transforming unit with a target instruction sequence determined in accordance with said pattern to be replaced.
22) An article of manufacture comprising a computer usable medium having computer readable program code means embodied therein for causing detection by a compiler, the computer readable program code means in said article of manufacture comprising computer readable program code means for causing a computer to effect the steps of claim 19.
23) A compiler program for causing a computer to function as a compiler detecting a pattern that is to be replaced and includes multiple predetermined instructions in a program to be optimized and replacing the detected pattern to be replaced with a target instruction sequence determined in accordance with the instruction sequence to be replaced, said compiler program causing said computer to function as: a target partial program detecting unit for detecting, from among partial programs of said program to be optimized, a partial program including instructions corresponding to all instructions included in said pattern to be replaced as a partial program to be optimized; an instruction sequence transforming unit for transforming, in said partial program to be optimized, instructions other than those instructions corresponding to instructions included in said pattern to be replaced and those instructions having execution dependencies different from said pattern to be replaced, so that dependencies between instructions included in said partial program to be optimized match said pattern to be replaced; an instruction sequence replacing unit for replacing said partial program to be optimized transformed by said instruction sequence transforming unit with a target instruction sequence determined in accordance with said pattern to be replaced.
24) A compiler program for causing a computer to function as a compiler detecting a pattern that is to be replaced and includes multiple predetermined instructions in a program to be optimized and replacing the detected pattern to be replaced with a target instruction sequence determined in accordance with the instruction sequence to be replaced, said compiler program causing said computer to function as: a target partial program detecting unit for detecting a partial program as said target partial program to be optimized if it is determined that instructions in said partial program that correspond to all instructions included in said pattern to be replaced are executed in an execution order indicated by execution dependencies between instructions in said pattern to be replaced; and an instruction sequence replacing unit for replacing said partial program to be optimized with a target instruction sequence determined in accordance with said pattern to be replaced.
25) A recording medium on which the compiler program according to claim 23 is recorded.
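By way of illustration only, the acceptance conditions recited in claims 12 and 15 can be sketched as a small Python check. The `Candidate` record, its field names, the `is_target` function, and the default reference proportion of 0.8 are all hypothetical and chosen for this sketch; the claims do not specify any data layout, cost model, or threshold value.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical summary of one partial program compared against a pattern."""
    pattern_size: int   # number of instructions in the pattern to be replaced
    matched: int        # pattern instructions that have a counterpart in the partial program
    added_cost: float   # assumed estimate: time added by inserting the absent instructions
    saved_time: float   # assumed estimate: time saved by using the target instruction sequence


def is_target(c: Candidate, reference_proportion: float = 0.8) -> bool:
    """Decide whether a partial program that may lack pattern instructions
    is still accepted as a partial program to be optimized."""
    if c.pattern_size <= 0:
        return False
    # Claim 12: the proportion of matched pattern instructions must reach
    # the predetermined reference proportion.
    if c.matched / c.pattern_size < reference_proportion:
        return False
    # Nothing is absent, so no instruction has to be added.
    if c.matched == c.pattern_size:
        return True
    # Claim 15: adding the absent instructions must cost less time than
    # the replacement saves.
    return c.added_cost < c.saved_time
```

For example, under these assumptions a candidate matching 9 of 10 pattern instructions with an added cost of 2.0 and a saving of 5.0 is accepted, while one matching only 7 of 10 is rejected by the proportion test before any cost comparison is made.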